In this project, I investigate the “births” dataset with the goal of identifying the variables which have a statistically significant effect on birth weight.
Initial exploration showed that missing data were concentrated in the dad_age variable (171 of 1,000 values missing). No other variable was missing more than 27 values, and only about 2% of all values in the dataset were missing (missing: 2%, observed: 98%).
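The overall percentage can be cross-checked against the NA counts printed in the summary(births) output of this report. The sketch below re-enters those counts by hand; the vector name na_counts and the assumed dimensions (1,000 rows, 14 columns) are mine, inferred from the printed output rather than stated by the report.

```r
# NA counts copied by hand from the summary(births) output in this report
na_counts <- c(dad_age = 171, len_preg = 2, is_premie = 2, num_visits = 9,
               marital = 1, mom_wt_gain = 27, smoke = 1, mom_white = 2)
n_rows <- 1000  # assumed: the factor counts in the summary add up to 1000
n_cols <- 14    # assumed: number of columns shown by head(births)

total_missing   <- sum(na_counts)
overall_missing <- total_missing / (n_rows * n_cols)       # share of all cells that are NA
dad_age_share   <- na_counts[["dad_age"]] / total_missing  # share of the NAs found in dad_age

round(100 * overall_missing, 1)
round(100 * dad_age_share, 1)
```

Under these assumptions about 1.5% of all cells are missing, which is consistent with the rounded 2% figure, and dad_age accounts for roughly 80% of the missing values. With the real data frame, colMeans(is.na(births)) would give the per-column missing fraction directly.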
head(births)
## dad_age mom_age maturity len_preg is_premie num_visits marital
## 1 NA 13 younger 39 fullterm 10 unmarried
## 2 NA 14 younger 42 fullterm 15 unmarried
## 3 19 15 younger 37 fullterm 11 unmarried
## 4 21 15 younger 41 fullterm 6 unmarried
## 5 NA 15 younger 39 fullterm 9 unmarried
## 6 NA 15 younger 38 fullterm 19 unmarried
## mom_wt_gain bwt low_bwt sex smoke mom_white mom_age_level
## 1 38 7.63 notlow male nonsmoker nonwhite teens
## 2 20 7.88 notlow male nonsmoker nonwhite teens
## 3 38 6.63 notlow female nonsmoker white teens
## 4 34 8.00 notlow male nonsmoker white teens
## 5 27 6.38 notlow female nonsmoker nonwhite teens
## 6 22 5.38 low male nonsmoker nonwhite teens
Summary:
summary(births)
## dad_age mom_age maturity len_preg
## Min. :14.00 Min. :13 younger :867 Min. :20.00
## 1st Qu.:25.00 1st Qu.:22 advanced:133 1st Qu.:37.00
## Median :30.00 Median :27 Median :39.00
## Mean :30.26 Mean :27 Mean :38.33
## 3rd Qu.:35.00 3rd Qu.:32 3rd Qu.:40.00
## Max. :55.00 Max. :50 Max. :45.00
## NA's :171 NA's :2
## is_premie num_visits marital mom_wt_gain
## fullterm:846 Min. : 0.0 married :613 Min. : 0.00
## premie :152 1st Qu.:10.0 unmarried:386 1st Qu.:20.00
## NA's : 2 Median :12.0 NA's : 1 Median :30.00
## Mean :12.1 Mean :30.33
## 3rd Qu.:15.0 3rd Qu.:38.00
## Max. :30.0 Max. :85.00
## NA's :9 NA's :27
## bwt low_bwt sex smoke mom_white
## Min. : 1.000 notlow:889 female:503 nonsmoker:873 nonwhite:284
## 1st Qu.: 6.380 low :111 male :497 smoker :126 white :714
## Median : 7.310 NA's : 1 NA's : 2
## Mean : 7.101
## 3rd Qu.: 8.060
## Max. :11.750
##
## mom_age_level
## teens :110
## early20s:281
## late20s :257
## early30s:219
## 35+ :133
##
##
Standard Deviation:
sapply(numeric_births, sd, na.rm=TRUE) # Standard Deviation
## dad_age mom_age maturity len_preg is_premie
## 6.7637662 6.2135826 0.3397446 2.9315529 0.3594961
## num_visits marital mom_wt_gain bwt low_bwt
## 3.9549337 0.4871648 14.2412966 1.5088603 0.3142893
## sex smoke mom_white mom_age_level
## 0.5002412 0.3321577 0.4514352 1.2137616
Variance:
sapply(numeric_births, var, na.rm=TRUE) # Variance
## dad_age mom_age maturity len_preg is_premie
## 45.74853295 38.60860861 0.11542643 8.59400245 0.12923741
## num_visits marital mom_wt_gain bwt low_bwt
## 15.64150078 0.23732951 202.81452933 2.27665926 0.09877778
## sex smoke mom_white mom_age_level
## 0.25024124 0.11032877 0.20379375 1.47321722
Interquartile Range:
sapply(numeric_births, IQR, na.rm=TRUE) # Interquartile Range
## dad_age mom_age maturity len_preg is_premie
## 10.00 10.00 0.00 3.00 0.00
## num_visits marital mom_wt_gain bwt low_bwt
## 5.00 1.00 18.00 1.68 0.00
## sex smoke mom_white mom_age_level
## 1.00 0.00 1.00 2.00
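The three calls above rely on an object called numeric_births whose construction is not shown. The standard deviations reported for two-level variables (for example about 0.34 for maturity) are consistent with each factor having been replaced by its integer level code, so the sketch below assumes that approach. The toy data frame is a hypothetical stand-in built from a few columns of the head(births) rows, and the factor level order for low_bwt is assumed from the order in the summary output.

```r
# First six rows of a few columns, copied from the head(births) output
toy <- data.frame(
  dad_age = c(NA, NA, 19, 21, NA, NA),
  bwt     = c(7.63, 7.88, 6.63, 8.00, 6.38, 5.38),
  sex     = factor(c("male", "male", "female", "male", "female", "male")),
  low_bwt = factor(c("notlow", "notlow", "notlow", "notlow", "notlow", "low"),
                   levels = c("notlow", "low"))  # assumed level order
)

# Assumed construction: replace every factor by its integer level code
numeric_toy <- data.frame(lapply(toy, as.numeric))

sapply(numeric_toy, sd, na.rm = TRUE)
sapply(numeric_toy, IQR, na.rm = TRUE)
```

Spread statistics computed on level codes are mainly useful as a rough indicator of how balanced a two-level variable is; for mom_age_level they also depend on the arbitrary 1 to 5 coding.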
## Warning: Removed 171 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing non-finite values (stat_bin).
## Warning: Removed 9 rows containing non-finite values (stat_bin).
## Warning: Removed 27 rows containing non-finite values (stat_bin).
Several variables in this dataset were strongly related, though many of these relationships arise because one variable is derived from another, which would cause multicollinearity if both were used as predictors. The variables related in this way are is_premie and len_preg (premies are by definition infants with a shorter length of pregnancy), low_bwt and bwt (low_bwt is a category based on birth weight), and maturity with mom_age and mom_age_level (maturity and mom_age_level are categories based on mom_age). Excluding these relationships, the strongest correlations (|R| >= 0.7) are between dad_age and mom_age (R: 0.78) and between dad_age and mom_age_level (R: 0.75). These two correlations depict the same relationship, since mom_age_level is based on mom_age, and indicate that in our dataset infants with older mothers are likely to also have older fathers, while infants with younger mothers are likely to have younger fathers.
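Because dad_age has many missing values, the correlations have to be computed on the rows where both variables are observed. The sketch below shows one way to do that with base R's cor() on simulated data; the simulated ages, their parameters and the cut-off used for the derived category are all hypothetical and only mimic the structure described above.

```r
set.seed(42)  # simulated data, not the births dataset
n <- 1000
mom_age <- rnorm(n, mean = 27, sd = 6)
dad_age <- mom_age + 3 + rnorm(n, mean = 0, sd = 4.5)
dad_age[sample(n, 171)] <- NA  # mimic the missingness in dad_age

# A category derived from mom_age, analogous to maturity (hypothetical cut-off)
maturity <- as.numeric(mom_age >= 35)

ages <- data.frame(mom_age, dad_age, maturity)

# Each pairwise correlation uses only the rows where both variables are observed
r <- cor(ages, use = "pairwise.complete.obs")
round(r, 2)
```

The derived category correlates with its parent variable by construction, which is why such pairs are set aside before interpreting the remaining correlations.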
## Warning: Removed 171 rows containing non-finite values (stat_smooth).
## Warning: Removed 171 rows containing missing values (geom_point).
There were several medium-strength correlations (|R| >= 0.5) within our dataset, including len_preg and bwt (R: 0.67), len_preg and low_bwt (R: -0.59), is_premie and bwt (R: -0.56), and is_premie and low_bwt (R: 0.56), indicating that longer pregnancies correspond to higher birth weight within our sample. dad_age and maturity had a coefficient of 0.50, but this relationship is better represented by the stronger correlation between dad_age and mom_age (mom_age records the mother's age in years, whereas maturity reduces it to two categories).
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).
The low-strength correlations (|R| >= 0.3) observed included dad_age and marital (R: -0.35), mom_age and marital (R: -0.44), mom_age_level and marital (R: -0.44), and mom_white and marital (R: -0.33), indicating that unmarried mothers in our sample are more likely to be young and non-white, with a young father.
A power test shows that with our sample size of 1000 and a significance level of 0.05, the power to detect a correlation of r = 0.3 rounds to 1, so we are all but certain to detect an effect of r = 0.3 or greater.
library(pwr) # provides pwr.r.test()
pwr.r.test(r=.3, n=1000, sig.level=.05) # power = 1
##
## approximate correlation power calculation (arctangh transformation)
##
## n = 1000
## r = 0.3
## sig.level = 0.05
## power = 1
## alternative = two.sided
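The reported power can be approximated by hand with the Fisher z (arctanh) transformation named in the output header. The sketch below is a simplified version of that approximation in base R and is not necessarily the exact formula pwr uses internally; the variable names are mine.

```r
r     <- 0.3
n     <- 1000
alpha <- 0.05

z_effect <- atanh(r) * sqrt(n - 3)  # Fisher z of r divided by its standard error
z_crit   <- qnorm(1 - alpha / 2)    # two-sided critical value

# Probability that the test statistic lands in either rejection region
power <- pnorm(z_effect - z_crit) + pnorm(-z_effect - z_crit)
power

# Smallest correlation detectable with 80% power under the same approximation
r_min <- tanh((z_crit + qnorm(0.8)) / sqrt(n - 3))
r_min
```

Under this approximation the power for r = 0.3 is indistinguishable from 1, and correlations as small as roughly 0.09 would still be detected 80% of the time at this sample size.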
I next performed Student’s t-tests to compare group means across the two-level variables. These tests revealed that male infants have a slightly higher birth weight on average than female infants (0.40 lb difference, p: 2.77e-05).
The tests also confirmed that married mothers were older on average than unmarried mothers (p: < 2.2e-16), and that the same was true of fathers (p: < 2.2e-16).
## Warning: Removed 170 rows containing non-finite values (stat_boxplot).
Married mothers made an average of 2 more hospital visits during pregnancy than unmarried mothers did (p: 2.50e-12).
## Warning: Removed 8 rows containing non-finite values (stat_boxplot).
The average birth weight for infants of married mothers was slightly higher than for infants of unmarried mothers (0.50 lb, p: 8.52e-07).
Smoking was somewhat related to age, with smoking mothers being an average of 2 years younger than nonsmoking mothers (p: 0.00025).
Nonsmoking mothers had infants with a birth weight on average 0.32 lb heavier than infants born to smoking mothers (p: 0.019).
White mothers tended to be an average of 2 years older (p: 2.43e-07), gain an average of 2.3 lb more weight during pregnancy, have a 0.64 week (4.5 day) longer average pregnancy (p: 0.0079), and have infants born an average of 0.53 lb heavier than non-white mothers (p: 1.93e-06).
Mothers of premature infants gained an average of 5.33 lb less during pregnancy (p: 1.37e-05) and visited the doctor 1.61 fewer times on average (p: 0.0001) than mothers of fullterm infants, and premature infants weighed an average of 2.33 lb less than fullterm infants (p: < 2.2e-16).
## Warning: Removed 26 rows containing non-finite values (stat_boxplot).
## Warning: Removed 8 rows containing non-finite values (stat_boxplot).
In our sample, infants in the low birth weight group were more likely to have mothers who gained less weight during pregnancy (p: 0.0018), visited the doctor less (p: 0.0029), and had shorter pregnancies (p: < 2.2e-16).
## Warning: Removed 27 rows containing non-finite values (stat_boxplot).
## Warning: Removed 9 rows containing non-finite values (stat_boxplot).
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
Power analysis on the t-tests showed that for the maturity, is_premie, low_bwt and smoke variables, we had about a 60% chance of detecting a small effect (d = 0.2) with a t-test. This low power is probably due to the unbalanced groups, with far fewer observations in the advanced age, premature infant, low birth weight and smoking mother categories. For the more balanced marital, sex and race groups, we had about an 85% chance of detecting a small effect size with a t-test.
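The effect of group imbalance on power can be illustrated with a short base R calculation using the noncentral t distribution. The helper name t_power is hypothetical, and the report does not show which function produced its own figures; the group sizes are copied from the summary(births) output.

```r
# Power of a two-sided, two-sample t-test with unequal group sizes
t_power <- function(n1, n2, d, alpha = 0.05) {
  df     <- n1 + n2 - 2
  ncp    <- d * sqrt(n1 * n2 / (n1 + n2))  # noncentrality parameter
  t_crit <- qt(1 - alpha / 2, df)
  pt(t_crit, df, ncp = ncp, lower.tail = FALSE) + pt(-t_crit, df, ncp = ncp)
}

# Group sizes copied from the summary(births) output
power_smoke <- t_power(873, 126, d = 0.2)  # nonsmoker vs smoker (unbalanced)
power_sex   <- t_power(503, 497, d = 0.2)  # female vs male (balanced)
c(smoke = power_smoke, sex = power_sex)
```

With nearly the same total sample size, the unbalanced split lands in the mid-50% range while the balanced split lands in the high-80% range, because the noncentrality parameter depends on n1 * n2 / (n1 + n2), which is largest when the groups are equal.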
ANOVA tests showed that the mother’s age level did not have a statistically significant effect on birth weight (Pr(>F) = 0.165). Since the confidence intervals of all pairwise comparisons within this ANOVA contained 0, we cannot reject the null hypothesis of no difference in birth weight between mother age levels.
However, further ANOVA testing showed statistically significant differences in the number of hospital visits during pregnancy across mother’s age levels (Pr(>F) = 2.38e-07). A Tukey HSD test showed statistically significant differences in mean number of visits between early20s-teens, late20s-teens, early30s-teens, 35+-teens, late20s-early20s and early30s-early20s, as the confidence intervals for these comparisons did not contain 0. The largest differences were between late20s-teens, early30s-teens and 35+-teens, with teen mothers visiting the hospital an average of 2.3 fewer times than mothers in these older age groups.
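The ANOVA followed by Tukey HSD workflow can be sketched in base R as follows. The data are simulated: the group sizes are taken from the summary(births) output, but the group means and standard deviation are hypothetical values chosen only to make the teens group differ from the others.

```r
set.seed(1)  # simulated data, not the births dataset
levels_age <- c("teens", "early20s", "late20s", "early30s", "35+")
group_n    <- c(110, 281, 257, 219, 133)     # group sizes from summary(births)
group_mean <- c(10, 11.5, 12.5, 12.5, 12.5)  # hypothetical mean visit counts

sim <- data.frame(
  mom_age_level = factor(rep(levels_age, times = group_n), levels = levels_age),
  num_visits    = rnorm(sum(group_n), mean = rep(group_mean, times = group_n), sd = 4)
)

fit_aov <- aov(num_visits ~ mom_age_level, data = sim)
summary(fit_aov)

# Pairwise comparisons with family-wise 95% confidence intervals
tukey <- TukeyHSD(fit_aov)
ci <- tukey$mom_age_level

# Comparisons whose interval excludes 0 differ significantly
ci[ci[, "lwr"] > 0 | ci[, "upr"] < 0, , drop = FALSE]
```

Five groups give ten pairwise comparisons, and the row names follow the same "late20s-teens" pattern used in the text above.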
## Warning: Removed 9 rows containing non-finite values (stat_boxplot).
Chi-squared tests showed significant relationships between maturity vs marital (p: 9.91e-07), is_premie vs marital (p: 0.0078), is_premie vs low_bwt (p: < 2.2e-16), is_premie vs mom_white (p: 0.018), marital vs low_bwt (p: 0.00019), marital vs smoke (p: 0.001), marital vs mom_white (p: < 2.2e-16) and low_bwt vs mom_white (p: 0.015).
Since mom_age_level is an ordinal variable, I used the Kruskal-Wallis test instead of the chi-squared test and found significant relationships between marital vs mom_age_level (p: < 2.2e-16), smoke vs mom_age_level (p: 0.013) and mom_white vs mom_age_level (p: 4.84e-09).
This confirmed that in our sample older mothers were more likely to be married, white and/or non-smokers, while premature infants were more likely to be born to unmarried mothers and/or non-white mothers, and to have low birth weight. Unmarried mothers were more likely to be associated with low birth weight infants, smoking, and being non-white.
Power analysis showed that our chi-squared tests had an 89% chance of detecting a small effect size.
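A figure of about 89% can be reproduced in base R with the noncentral chi-squared distribution. The sketch assumes Cohen's conventional small effect size (w = 0.1), a 2 x 2 table (1 degree of freedom) and the full sample of 1000; none of these settings are stated in the report, so they are labelled as assumptions.

```r
w     <- 0.1   # assumed: Cohen's "small" effect size for chi-squared tests
N     <- 1000  # assumed: full sample size
df    <- 1     # assumed: a 2 x 2 contingency table
alpha <- 0.05

crit  <- qchisq(1 - alpha, df)  # rejection threshold under the null
# Under the alternative the statistic is noncentral chi-squared with ncp = N * w^2
power <- pchisq(crit, df, ncp = N * w^2, lower.tail = FALSE)
round(power, 2)
```

Tables with more cells have more degrees of freedom and therefore somewhat lower power for the same w and N.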
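fit_bwt_final <- lm(bwt ~ len_preg + marital + mom_wt_gain + sex + smoke + mom_white, data = births) # same formula as the Call shown in the output
summary(fit_bwt_final)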
##
## Call:
## lm(formula = bwt ~ len_preg + marital + mom_wt_gain + sex + smoke +
## mom_white, data = births)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7613 -0.6582 -0.0263 0.6872 4.2173
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.977147 0.462464 -12.925 < 2e-16 ***
## len_preg 0.329042 0.011961 27.509 < 2e-16 ***
## maritalunmarried -0.265136 0.075729 -3.501 0.000485 ***
## mom_wt_gain 0.009258 0.002431 3.809 0.000149 ***
## sexmale 0.378405 0.068971 5.486 5.24e-08 ***
## smokesmoker -0.388843 0.104758 -3.712 0.000218 ***
## mom_whitewhite 0.212131 0.081324 2.608 0.009236 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.071 on 963 degrees of freedom
## (30 observations deleted due to missingness)
## Multiple R-squared: 0.4869, Adjusted R-squared: 0.4837
## F-statistic: 152.3 on 6 and 963 DF, p-value: < 2.2e-16
My final model finds that pregnancy length, mother’s marital status, mother’s weight gain during pregnancy, sex of the baby, smoking status of the mother, and race of the mother together explain about half of the observed variation in birth weight within our sample (Adjusted R-squared: 0.4837). The variables with the largest fitted coefficients are length of pregnancy (0.33 lb per week), sex of the infant (0.38 lb) and smoking status of the mother (-0.39 lb).
This model has a fairly high degree of precision, with a residual standard error of 1.071 on 963 degrees of freedom. I arrived at this model by beginning with a model which included all variables, and then removing those variables causing multicollinearity with the aid of VIF summaries. To avoid overfitting the model, I then removed the least significant variables from the model one at a time, until only statistically significant variables were left (p < 0.05). While doing this I monitored the adjusted R-squared to ensure the model’s goodness of fit did not deteriorate noticeably with the removal of each variable.
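The report does not say which function produced the VIF summaries. For numeric predictors, the variance inflation factor can be computed by hand as 1 / (1 - R^2), where R^2 comes from regressing one predictor on all the others. The sketch below does this on simulated data in which a hypothetical is_premie indicator is derived from len_preg; the cut-off of 37 weeks and all distribution parameters are assumptions.

```r
set.seed(7)  # simulated data, not the births dataset
n <- 500
len_preg    <- rnorm(n, mean = 38, sd = 3)
is_premie   <- as.numeric(len_preg < 37)  # derived from len_preg, so collinear with it
mom_wt_gain <- rnorm(n, mean = 30, sd = 14)
X <- data.frame(len_preg, is_premie, mom_wt_gain)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the rest
vif_by_hand <- sapply(names(X), function(v) {
  f  <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})
round(vif_by_hand, 2)
```

The two related predictors show inflated values while the independent one stays close to 1. Factor predictors with more than two levels need a generalized VIF, which this sketch does not cover.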
Plotting our model’s residuals vs fitted values showed that the residuals are evenly distributed above and below zero, which supports the adequacy of the model. The lack of pattern in the residuals vs fitted plot indicates that there are no nonlinear trends in the data unaccounted for by our model. No Cook’s distance contour lines appear within the residuals vs leverage plot, indicating that outliers are not having a significant influence on the model.
In our sample, male infants’ birth weight was 0.38 pounds higher on average than that of female infants, when all other variables are held constant. This finding is unsurprising, as in humans, males generally weigh more than females at birth (Kumar et al. 2013).
I also found that mothers classified as smokers have babies whose birth weight was 0.39 pounds lower on average than that of babies of nonsmoker mothers, when all other variables are held constant. This finding is consistent with previous studies which have found maternal smoking to be associated with a decrease in birth weight (Suzuki et al. 2008, Kataoka et al. 2018).
Mother’s weight gain during pregnancy was found to positively affect birth weight. For every 1 pound increase in mother’s weight gain during pregnancy, birth weight increased by 0.01 pounds on average, when all other variables are held constant. Though not all mothers gain weight during pregnancy, previous studies have found that in general gestational weight gain is associated with a higher birth weight (Deputy et al. 2015).
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 27 rows containing missing values (geom_point).
Length of pregnancy had a clear positive linear association with birth weight, with a one week increase in length of pregnancy being associated with a 0.33 pound increase in birth weight on average, when all other variables are held constant.
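To make the "all other variables held constant" interpretation concrete, the fitted coefficients can be applied to a single profile by hand. The coefficients below are copied from the summary(fit_bwt_final) output; the profile itself (39 weeks, married, 30 lb gained, female infant, nonsmoker, white) is a hypothetical example, and the shortened coefficient names are mine.

```r
# Coefficients copied from summary(fit_bwt_final)
coefs <- c(intercept = -5.977147, len_preg = 0.329042, unmarried = -0.265136,
           mom_wt_gain = 0.009258, male = 0.378405, smoker = -0.388843,
           white = 0.212131)

# Hypothetical profile: 39 weeks, married, 30 lb gained, female, nonsmoker, white
profile <- c(intercept = 1, len_preg = 39, unmarried = 0,
             mom_wt_gain = 30, male = 0, smoker = 0, white = 1)

pred_39 <- sum(coefs * profile)           # predicted birth weight in pounds
pred_40 <- pred_39 + coefs[["len_preg"]]  # one extra week, everything else unchanged
round(c(weeks_39 = pred_39, weeks_40 = pred_40), 2)
```

The predicted weight at 39 weeks is about 7.35 lb, close to the sample median of 7.31 lb, and each additional week adds the len_preg coefficient regardless of the other values in the profile.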
Overall, this model found that length of pregnancy, gestational weight gain, white mothers and male infants were positively related to birth weight, while unmarried and smoking mothers were negatively related to birth weight. The variable with the strongest potential effect on birth weight was length of pregnancy. There is a strong possibility that the influence of mother’s marriage status, race and smoking status on birth weight is related to the latent variable of socio-economic status. Low socio-economic status is associated with low birth weight (Parker et al. 1994, Bublitz et al. 2016), and has been found to correlate with being a single mother (Meyer & Sullivan, 2004), smoking (Hiscock et al. 2011), and race (Braveman et al. 2010).
fit_lowbwt_final <- glm(low_bwt ~ len_preg + marital, family = binomial(), data = births) # same formula as the Call shown in the output
summary(fit_lowbwt_final)
##
## Call:
## glm(formula = low_bwt ~ len_preg + marital, family = binomial(),
## data = births)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9834 -0.3012 -0.2132 -0.1479 3.4403
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 23.63838 2.24101 10.548 <2e-16 ***
## len_preg -0.70286 0.06131 -11.464 <2e-16 ***
## maritalunmarried 0.66923 0.27075 2.472 0.0134 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 692.56 on 997 degrees of freedom
## Residual deviance: 396.08 on 995 degrees of freedom
## (2 observations deleted due to missingness)
## AIC: 402.08
##
## Number of Fisher Scoring iterations: 6
library(pscl) # provides pR2()
pR2(fit_lowbwt_final)
## llh llhNull G2 McFadden r2ML
## -198.0396885 -348.6009843 301.1225915 0.4319015 0.2604594
## r2CU
## 0.5181017
My final logistic regression model finds that pregnancy length and mother’s marital status are statistically significant predictors (p < 0.05) of low birth weight in our sample. The residual deviance (396.08 on 995 degrees of freedom) is much lower than the null deviance (692.56 on 997 degrees of freedom), meaning our model explains much of the deviance in low birth weight status. McFadden’s pseudo R-squared is 0.43. Although it is not a proportion of variance explained in the way a linear-model R-squared is, it indicates that the model fits considerably better than the intercept-only model.
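McFadden's pseudo R-squared is one minus the ratio of the fitted model's log-likelihood to that of the intercept-only model. The sketch below recomputes it from the values printed by pR2 and, as a cross-check, from the deviances printed by summary; the variable names are mine.

```r
llh      <- -198.0396885  # log-likelihood of the fitted model (from the pR2 output)
llh_null <- -348.6009843  # log-likelihood of the intercept-only model (from the pR2 output)

mcfadden <- 1 - llh / llh_null
g2       <- -2 * (llh_null - llh)  # likelihood-ratio statistic (G2 in the pR2 output)

# For ungrouped binary data the deviance equals -2 * log-likelihood, so the
# same quantity can also be computed from the deviances printed by summary()
mcfadden_dev <- 1 - 396.08 / 692.56

round(c(mcfadden = mcfadden, G2 = g2, mcfadden_dev = mcfadden_dev), 4)
```

The two versions differ slightly (about 0.432 vs 0.428). One possible explanation, not verified here, is that the null log-likelihood reported by pR2 was computed on all 1,000 rows rather than the 998 used in the fit, since -2 * llhNull is about 697.20, which matches the null deviance reported for the imputed model rather than 692.56.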
I arrived at this model by beginning with a model which included all variables, and then removing those variables causing multicollinearity with the aid of VIF summaries. I then removed the least significant variables from the model one at a time, until only statistically significant variables were left (p < 0.05). While doing this I monitored the difference between null and residual deviance, ensuring it did not decrease noticeably with the removal of each variable.
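The one-at-a-time removal described above was done manually in this report. A loop that automates the same idea might look like the sketch below, which runs on simulated data with numeric predictors only. With factor predictors the coefficient names (for example maritalunmarried) differ from the term names, so this simple version would need adapting; all simulated values are hypothetical.

```r
set.seed(3)  # simulated data, not the births dataset
n <- 800
sim <- data.frame(
  len_preg   = rnorm(n, 38, 3),
  num_visits = rpois(n, 12),
  dad_age    = rnorm(n, 30, 7)
)
# The outcome depends on len_preg only; the other two predictors are pure noise
sim$low_bwt <- rbinom(n, 1, plogis(23 - 0.65 * sim$len_preg))

fit <- glm(low_bwt ~ len_preg + num_visits + dad_age, family = binomial(), data = sim)
repeat {
  p <- coef(summary(fit))[-1, "Pr(>|z|)", drop = FALSE]  # p-values, intercept dropped
  if (max(p) < 0.05) break                               # every remaining term is significant
  worst <- rownames(p)[which.max(p)]                     # least significant term
  fit <- update(fit, as.formula(paste(". ~ . -", worst)))
}
formula(fit)
```

A noise variable can survive such a procedure by chance, which is one reason to monitor the deviance alongside the p-values as described above.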
exp(coef(fit_lowbwt_final))
## (Intercept) len_preg maritalunmarried
## 1.845089e+10 4.951682e-01 1.952728e+00
The very large intercept odds ratio (1.85e+10) is the estimated odds of low birth weight for a married mother with a pregnancy length of 0 weeks. Because the observed pregnancy lengths range from 20 to 45 weeks, this is an extrapolation far outside the data and is not meaningful on its own. For reference, our sample contains roughly 8 times as many notlow infants as low birth weight infants (889 vs 111).
# plot(fit_lowbwt_final, 1)
plot(fit_lowbwt_final)
Plotting the residuals of the model reveals that the residuals are somewhat evenly distributed around zero. Plotting residuals vs leverage shows that no points fall outside the Cook’s distance contours, indicating that no outliers are exerting an undue effect on the model.
When the mother is unmarried, the log odds of the infant having low birth weight increase by 0.67 units on average as compared with married mothers, holding all other variables constant. Equivalently, for unmarried mothers as opposed to married mothers, the odds of having a low birth weight baby are multiplied by exp(coef) = exp(0.66923) = 1.95 (i.e. an increase of 95%) on average, when all other variables in the model are held constant.
For every one week increase in length of pregnancy, the log odds of the infant having low birth weight decrease by 0.70 units on average, when all other variables in the model are held constant. Equivalently, the odds of having a low birth weight infant are multiplied by exp(coef) = exp(-0.70286) = 0.50 (i.e. a decrease of 50%) for each additional week. Note that this is a 50% reduction in the odds, not in the probability, of a low birth weight infant.
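The distinction between odds and probabilities can be made concrete with the fitted coefficients. The sketch below copies the estimates from summary(fit_lowbwt_final) and evaluates two hypothetical profiles at 39 weeks; the variable names are mine.

```r
# Coefficients copied from summary(fit_lowbwt_final)
b0          <- 23.63838
b_len_preg  <- -0.70286
b_unmarried <-  0.66923

odds_ratios <- exp(c(len_preg = b_len_preg, maritalunmarried = b_unmarried))
pct_change  <- 100 * (odds_ratios - 1)  # percent change in the odds

# Predicted probability of low birth weight at 39 weeks (hypothetical profiles)
p_married   <- plogis(b0 + b_len_preg * 39)
p_unmarried <- plogis(b0 + b_len_preg * 39 + b_unmarried)

round(odds_ratios, 2)
round(pct_change)
round(c(married = p_married, unmarried = p_unmarried), 3)
```

At 39 weeks the predicted probability is roughly 2% for a married mother and roughly 4% for an unmarried mother: the odds nearly double, but both probabilities remain small.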
library(ggplot2)
b <- ggplot(births) # assumed definition; `b` is not defined in this excerpt
b + geom_boxplot(aes(x = low_bwt, y = len_preg, fill = low_bwt)) +
  ggtitle("Birth Weight Group vs. Pregnancy Length") +
  xlab("Baby's Birth Weight Group") + ylab("Length of Pregnancy (Weeks)") +
  theme_classic() + scale_fill_discrete(name = "Birth Weight Group")
## Warning: Removed 2 rows containing non-finite values (stat_boxplot).
While this model found that both length of pregnancy and marriage status of mother were significant predictors for low birth weight in our dataset, length of pregnancy had a much greater overall effect on birth weight than marriage status.
As a sensitivity check, to ensure that the missing observations did not have a large effect on the model results, I created an imputed version of the dataset. I did this by replacing missing numeric variables with Predictive Mean Matching and missing two-level categorical variables with logistic regression predictions, and then re-ran the linear and logistic regression on the imputed dataset. When re-running the regressions, I repeated my original process by starting with all non-related independent variables and removing variables without significant fitted coefficients until I arrived at a model with only significant independent variables.
# start with all non-related independent variables, and remove variables without significant fitted coefficients
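The report does not name the package used for imputation. The mice package is one common choice: its default methods are predictive mean matching ("pmm") for numeric columns and logistic regression ("logreg") for two-level factors, which matches the description above. The sketch below assumes mice and runs on simulated data; every value in it is hypothetical, and the block is skipped if mice is not installed.

```r
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)
  set.seed(11)  # simulated data, not the births dataset
  n <- 300
  sim <- data.frame(
    len_preg    = rnorm(n, 38, 3),
    mom_wt_gain = rnorm(n, 30, 14),
    smoke       = factor(sample(c("nonsmoker", "smoker"), n, replace = TRUE,
                                prob = c(0.87, 0.13)))
  )
  sim$bwt <- -6 + 0.33 * sim$len_preg + 0.01 * sim$mom_wt_gain + rnorm(n)

  # Punch holes into one numeric column and one two-level factor
  sim$mom_wt_gain[sample(n, 20)] <- NA
  sim$smoke[sample(n, 10)] <- NA

  # Default methods: "pmm" for numeric columns, "logreg" for two-level factors
  imp <- mice(sim, m = 1, seed = 123, printFlag = FALSE)
  imputed_sim <- complete(imp, 1)

  print(imp$method)          # method applied to each column ("" means nothing to impute)
  print(anyNA(imputed_sim))  # FALSE once every gap has been filled
}
```

A single completed dataset (m = 1) is enough for a sensitivity check of this kind; pooling results over several imputations would be needed to reflect imputation uncertainty in the standard errors.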
# summary(lm(bwt~dad_age+len_preg+num_visits+marital+mom_wt_gain+sex+smoke+mom_white, data=imputed_births))
# summary(lm(bwt~dad_age+len_preg+marital+mom_wt_gain+sex+smoke+mom_white, data=imputed_births))
summary(lm(bwt~len_preg+marital+mom_wt_gain+sex+smoke+mom_white, data=imputed_births))
##
## Call:
## lm(formula = bwt ~ len_preg + marital + mom_wt_gain + sex + smoke +
## mom_white, data = imputed_births)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.7473 -0.6594 -0.0231 0.6752 4.2138
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.161860 0.450209 -13.687 < 2e-16 ***
## len_preg 0.333626 0.011657 28.620 < 2e-16 ***
## maritalunmarried -0.248430 0.074600 -3.330 0.000900 ***
## mom_wt_gain 0.008999 0.002393 3.761 0.000179 ***
## sexmale 0.358567 0.068139 5.262 1.74e-07 ***
## smokesmoker -0.370056 0.103504 -3.575 0.000367 ***
## mom_whitewhite 0.234580 0.080545 2.912 0.003667 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.074 on 993 degrees of freedom
## Multiple R-squared: 0.496, Adjusted R-squared: 0.4929
## F-statistic: 162.9 on 6 and 993 DF, p-value: < 2.2e-16
The final imputed linear regression model included the same variables as the un-imputed model, and had an adjusted R-squared almost identical to that of the un-imputed model (imputed 0.4929 vs unimputed 0.4837).
# summary(glm(low_bwt~dad_age+mom_age_level+len_preg+num_visits+marital+mom_wt_gain+sex+smoke+mom_white, family=binomial(), data=imputed_births))
# summary(glm(low_bwt~dad_age+mom_age_level+len_preg+num_visits+marital+mom_wt_gain+sex+smoke, family=binomial(), data=imputed_births))
# summary(glm(low_bwt~dad_age+len_preg+num_visits+marital+mom_wt_gain+sex+smoke, family=binomial(), data=imputed_births))
#summary(glm(low_bwt~dad_age+len_preg+num_visits+marital+sex+smoke, family=binomial(), data=imputed_births))
#summary(glm(low_bwt~dad_age+len_preg+num_visits+marital+smoke, family=binomial(), data=imputed_births))
#summary(glm(low_bwt~dad_age+len_preg+marital+smoke, family=binomial(), data=imputed_births))
#summary(glm(low_bwt~len_preg+marital+smoke, family=binomial(), data=imputed_births))
summary(glm(low_bwt~len_preg+marital, family=binomial(), data=imputed_births))
##
## Call:
## glm(formula = low_bwt ~ len_preg + marital, family = binomial(),
## data = imputed_births)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9859 -0.3009 -0.2128 -0.1475 3.4427
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 23.68514 2.23550 10.595 <2e-16 ***
## len_preg -0.70415 0.06117 -11.511 <2e-16 ***
## maritalunmarried 0.66974 0.27076 2.474 0.0134 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 697.20 on 999 degrees of freedom
## Residual deviance: 396.25 on 997 degrees of freedom
## AIC: 402.25
##
## Number of Fisher Scoring iterations: 6
pR2(glm(low_bwt~len_preg+marital, family=binomial(), data=imputed_births))
## llh llhNull G2 McFadden r2ML
## -198.1226674 -348.6009843 300.9566338 0.4316635 0.2598901
## r2CU
## 0.5176854
The final imputed logistic regression model also included the same variables as the un-imputed model, and had a McFadden’s pseudo R-squared almost identical to that of the un-imputed model (imputed 0.4316635 vs unimputed 0.4319015).
Based on the similarities between the imputed and unimputed models, I concluded that the missing data did not have a significant influence on the models, increasing my level of confidence in the original models.
Braveman, P. A., Cubbin, C., Egerter, S., Williams, D. R., & Pamuk, E. (2010). Socioeconomic disparities in health in the United States: what the patterns tell us. American Journal of Public Health, 100(Suppl 1), S186–S196. doi: 10.2105/AJPH.2009.166082
Bublitz, M. H., Vergara-Lopez, C., O’Reilly Treter, M., & Stroud, L. R. (2016). Association of Lower Socioeconomic Position in Pregnancy with Lower Diurnal Cortisol Production and Lower Birthweight in Male Infants. Clinical Therapeutics, 38(2), 265–274. doi: 10.1016/j.clinthera.2015.12.007
Deputy, N. P., Sharma, A. J., & Kim, S. Y. (2015). Gestational Weight Gain - United States, 2012 and 2013. MMWR. Morbidity and Mortality Weekly Report, 64(43), 1215–1220. doi: 10.15585/mmwr.mm6443a3
Hiscock, R., Bauld, L., Amos, A., Fidler, J. A., & Munafò, M. (2011). Socioeconomic status and smoking: a review. Annals of the New York Academy of Sciences, 1248(1), 107–123. doi: 10.1111/j.1749-6632.2011.06202.x
Kataoka, M. C., Carvalheira, A. P. P., Ferrari, A. P., Malta, M. B., Carvalhaes, M. A. D. L., & Parada, C. M. G. D. (2018). Smoking during pregnancy and harm reduction in birth weight: a cross-sectional study. BMC Pregnancy and Childbirth, 18(1). doi: 10.1186/s12884-018-1694-4
Kumar, V. S., Jeyaseelan, L., Sebastian, T., Regi, A., Mathew, J., & Jose, R. (2013). New birth weight reference standards customised to birth order and sex of babies from South India. BMC Pregnancy and Childbirth, 13(1), 1–8. doi: 10.1186/1471-2393-13-38
Meyer, B. D., & Sullivan, J. X. (2004). The effects of welfare and tax reform: the material well-being of single mothers in the 1980s and 1990s. Journal of Public Economics, 88(7-8), 1387–1420. doi: 10.1016/s0047-2727(02)00219-0
Parker, J. D., Schoendorf, K. C., & Kiely, J. L. (1994). Associations between measures of socioeconomic status and low birth weight, small for gestational age, and premature delivery in the United States. Annals of Epidemiology, 4(4), 271–278. doi: 10.1016/1047-2797(94)90082-5
Suzuki, K., Tanaka, T., Kondo, N., Minai, J., Sato, M., & Yamagata, Z. (2008). Is maternal smoking during early pregnancy a risk factor for all low birth weight infants? Journal of Epidemiology, 18(3), 89–96. doi: 10.2188/jea.je2007415